Outlier Detection Using Ball Descriptions with Adjustable Metric

Authors

  • David M. J. Tax
  • Piotr Juszczak
  • Elzbieta Pekalska
  • Robert P. W. Duin
Abstract

Sometimes novel or outlier data has to be detected. The outliers may indicate some interesting rare event, or they should be disregarded because they cannot be reliably processed further. In the ideal case where the objects are represented by very good features, the genuine data forms a compact cluster and a good outlier measure is the distance to the cluster center. This paper proposes three new formulations to find a good cluster center together with an optimized p-distance measure. Experiments show that for some real-world datasets very good classification results are obtained and that, more specifically, the 1-distance is particularly suited for datasets containing discrete feature values.
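
The idea of scoring an object by its p-distance to a cluster center can be illustrated with a minimal sketch. This is not one of the three formulations proposed in the paper: here the center is simply assumed to be the coordinate-wise mean, p is fixed by hand rather than optimized, and the radius is taken as a quantile of the training distances. The names `p_distance`, `fit_ball`, `is_outlier` and the parameter `reject_fraction` are hypothetical and chosen only for illustration.

```python
import numpy as np

def p_distance(X, center, p):
    """Minkowski p-distance from each row of X to a single center."""
    diff = np.abs(np.asarray(X, dtype=float) - np.asarray(center, dtype=float))
    return np.sum(diff ** p, axis=1) ** (1.0 / p)

def fit_ball(X_train, p=2.0, reject_fraction=0.1):
    """Fit a ball around the training data.

    Assumptions (not from the paper): the center is the coordinate-wise
    mean, and the radius is the (1 - reject_fraction) quantile of the
    training distances, so roughly that fraction of the training
    objects falls outside the ball.
    """
    X_train = np.asarray(X_train, dtype=float)
    center = X_train.mean(axis=0)
    radius = np.quantile(p_distance(X_train, center, p), 1.0 - reject_fraction)
    return center, radius

def is_outlier(X, center, radius, p=2.0):
    """Flag objects whose p-distance to the center exceeds the radius."""
    return p_distance(X, center, p) > radius

if __name__ == "__main__":
    # Discrete-valued toy data, scored with the 1-distance.
    X = np.array([[0, 0], [1, 0], [0, 1], [1, 1]])
    center, radius = fit_ball(X, p=1.0, reject_fraction=0.0)
    print(is_outlier(np.array([[10, 10], [0.5, 0.5]]), center, radius, p=1.0))
```

With p = 1 the score is the sum of absolute coordinate differences, which is the distance the abstract singles out for datasets with discrete feature values; changing `p` changes the shape of the ball without changing the rest of the sketch.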


Similar resources

Number 3

Outlier detection is a critical research task owing to its applications in a variety of domains, ranging from data mining, clustering and statistical analysis to fraud detection, network intrusion detection and the diagnosis of diseases. Over the last few decades, distance-based outlier detection algorithms have gained significant reputation as a viable alternative to the more traditio...


Detection of Peculiar Word Sense by Distance Metric Learning with Labeled Examples

For natural language processing on machines, resolving such peculiar usages would be particularly useful in constructing a dictionary and dataset for word sense disambiguation. Hence, it is necessary to develop a method to detect such peculiar examples of a target word from a corpus. Note that, hereinafter, we define a peculiar example as an instance in which the target word or phrase has a new...


Identification of outliers types in multivariate time series using genetic algorithm

Multivariate time series data are often modeled using the vector autoregressive moving average (VARMA) model. But the presence of outliers can violate the stationarity assumption and may lead to wrong modeling, biased estimation of parameters and inaccurate prediction. Thus, detection of these points and how to deal properly with them, especially in relation to modeling and parameter estimation of VARMA m...


Initialization of K-modes clustering using outlier detection techniques

The K-modes clustering has received much attention, since it works well for categorical data sets. However, the performance of K-modes clustering is especially sensitive to the selection of initial cluster centers. Therefore, choosing the proper initial cluster centers is a key step for K-modes clustering. In this paper, we consider the initialization of K-modes clustering from the view of outl...




Publication date: 2006